KiaDev Intelligence

#agentic reinforcement learning · 30/08/2025

rStar2-Agent: How a 14B Agentic RL Model Beats Bigger Models at Math

Microsoft's rStar2-Agent integrates code execution into the reasoning loop, allowing a 14B model to outperform larger systems on math benchmarks with shorter reasoning traces.
